-
Notifications
You must be signed in to change notification settings - Fork 48
Full support for DB numbers other than zero in cluster mode. #410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Fixes issue #204 |
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
src/schema_manager.h
Outdated
| The SchemaManager uses the MetadataManager to manage a single name space of | ||
| indexes, i.e., map<string, GlobalMetadataEntry>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the thought going into this. The interoperability is a headache
One question: when someone creates an index in db_num > 0 on Valkey 9, how would a shard running on Valkey 8 behave? I am guessing it would crash, right? I guess that behavior is fine, but we should document this
Another option I guess is that we have namespaces in the metadata manager. Right now we use "vs_index_schema". We could move to "vs_index_schema2"? Nodes on the old versions would just not get the index definitions for those created with vs_index_schema2, which should be fine, and we can register for callbacks of "vs_index_schema" in the new node for backwards compat. Probably a bit more work, but it would prevent the Valkey 8.0 nodes from crashing in such cases
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
|
Revised to reject RDB or cross-shard metadata that contains any object with a semantic version higher than what's understood by the current code. |
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
Signed-off-by: Allen Samuels <[email protected]>
| The mapping is done by manipulating the object name. It is known that | ||
| pre-Valkey 9 object names cannot contain a valid hash-tag. This provides the | ||
| necessary information to distinguish between an object name and an encoded | ||
| <db_num, object_name> pair. Objects with a db_num of 0 are stored the same | ||
| way as pre-Valkey 9 objects, providing backward compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have the encoding version added - could we just add a new field to the IndexSchema proto (e.g. clustered_db_num)? Old versions would silently drop it, but newer versions would support it.
This way, db_num can be removed from the metadata manager entirely (not encoded in the name and not added as a middle layer of the table).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would disallow two indexes in different db's to have the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlying problem is that MetadataManager has a two-level naming scheme and now we need a three-level naming scheme. Plus, this stuff is preserved in the RDB file, so there's a backward compatibility issue. We can certainly move to a three-level scheme with conversion from old code. But that's a lot of changes to MetadataManager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another scheme would be to reuse the top-level naming for each DB. That's a change in the philosophy of MetadataManager as it would require dynamic registration of the type level types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, yeah that makes sense, was missing that part.
If I were redesigning it - the metadata entry would be indexed by a UUID instead of just the index name. The UUID could be computed by the original creator.
Is there some way we could move to a UUID based scheme now? I see two ways:
- Leave the existing metadata type namespace, and just change the schema name to be UUID indexed:
If we assume the new UUIDs are unique (they should be, or at least have an astronomically low chance of collision) then they should be able to merge cleanly with the existing IDs. We would just make the "ID" for metadata manager work like: DB!=0 -> UUID, DB==0 -> index name
Reading the code, the only place the ID is used in the Schema Manager is we use this to delete the existing index before processing the newer version of the index. So it shouldn't cause a crash. If you do create an index on DB!=0 when the cluster has nodes on the older version of Valkey-Search, it should work okay actually. If you later went to delete it, it would skip the delete since it would not find the index by the ID (which is now a UUID), which I guess is okay since it isn't even accessible on that node (it is just wasted resources). If you were to recreate it - it would log an error and fail to process - which I also think is okay since it is inaccessible on that node. Once you upgrade to the latest version, it should be able to handle the metadata (maybe we need to add some MetadataManager <-> SchemaManager reconciliation on load)
- Add a new metadata type namespace for UUID-indexed:
This would separate the two indexing schemes into two high level types, the old one, which is used for DB=0, and the new one, which is used for DB!=0.
Nodes on the old versions would just not apply the metadata for unknown types. So this works quite cleanly. Once they upgrade, ideally they would then apply the metadata
What do you think? 1 is less clean on the older version nodes, but this already feels like an edge case. As long as we can get it to not crash and it would eventually recover when fully upgraded, I am okay with some odd behaviors on the old nodes. But 2 should be quite clean, I just don't know if we want to support this two-type setup long term (it is complexity that we later need to clean up)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today, the MetadataManager takes care of resolving consistency issues. In order to support UUID naming, it would have to have some way to determine if a newly received UUID conflicts with some existing UUID. That would require that registered types have some mechanism that would assist this. In other words a client would be queried with a new protobuf and would have to return some UUID that conflicted with it. That seems worse to me, the value of MetadataManager is that he understands the objects and performs collision resolution.
Here, the problem is fundamentally naming. If we confine the discussion to one registered type, then the current system has a flat namespace. I think you either redesign MetadataManager to have a more hierarchical namespace OR you encode the hierarchy into the existing flat namespace. I've chosen the second path -- because I believe you were on the right path before, but the addition of the dbnum, was just unexpected. (BTW, we'll face exactly the same issues when we implement aliasing -- which I plan to do soon).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fundamentally about deriving some "primary key" from the index metadata that we can use to uniquely identify an index. It seems like a fundamental inflexibility that the primary key cannot be updated. That's fine as long as we are sure it will never really update. If dbnum is some once-in-a-project problem, then I am okay with what you have proposed.
But - if we need to add another field to the primary key - it seems like we won't be able to take the same path out again, right?
How about this: could we use a proto for the primary key?
message IndexMetadataKey {
string name = 1;
int db_num = 2;
...
}
So we would then serialize this and use this as the key. If we ever add a new field to the key, it would be cross version compatible as long as the default value for that field is handled. It is a similar option to UUID, but it guarantees two semantically identically indexes collide and are reconciled by the metadata manager
fixes #204